A Concise Information-Theoretic Derivation of the Baum-Welch Algorithm
Abstract
We derive the Baum-Welch algorithm for hidden Markov models (HMMs) through an information-theoretic approach using cross-entropy, instead of the Lagrange-multiplier approach that is universal in the machine-learning literature. The proposed approach provides a more concise derivation of the Baum-Welch method and generalizes naturally to multiple observations.

Introduction

The basic hidden Markov model (HMM) [5] is defined as having a sequence of hidden or latent states Q = {qt} = {q1, q2, . . . , qT} (where t denotes the time step), in which each state is statistically independent of all but the state immediately before it, and each state emits an observation ot with a stationary (non-time-varying) probability density. Formally, the model is defined as:

p(O, Q | λ) = p(q1 | λ) [ ∏_{t=2}^{T} p(qt | qt−1, λ) ] [ ∏_{t=1}^{T} p(ot | qt, λ) ]
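The factorization above — initial-state probability, a product of transition probabilities, and a product of emission probabilities — can be evaluated directly for a given state sequence. The following is a minimal sketch using an illustrative two-state, two-symbol HMM; the parameters `pi`, `A`, and `B` are made-up for the example and are not taken from the paper:

```python
import numpy as np

# Illustrative toy HMM parameters (not from the paper).
pi = np.array([0.6, 0.4])           # pi[i]   = p(q1 = i | lambda)
A = np.array([[0.7, 0.3],           # A[i, j] = p(q_t = j | q_{t-1} = i, lambda)
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],           # B[i, k] = p(o_t = k | q_t = i, lambda)
              [0.2, 0.8]])

def joint_likelihood(states, obs):
    """p(O, Q | lambda) = p(q1) * prod_t p(q_t | q_{t-1}) * prod_t p(o_t | q_t)."""
    p = pi[states[0]]
    for t in range(1, len(states)):
        p *= A[states[t - 1], states[t]]   # transition term
    for t in range(len(states)):
        p *= B[states[t], obs[t]]          # emission term
    return p

# Example: state sequence (0, 0, 1) with observations (0, 1, 1).
print(joint_likelihood([0, 0, 1], [0, 1, 1]))  # 0.6*0.7*0.3 * 0.9*0.1*0.8
```

Summing this quantity over all state sequences Q gives the observation likelihood p(O | λ); the forward-backward recursions that Baum-Welch relies on compute that sum efficiently.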
Similar Resources
Comparing the Bidirectional Baum-Welch Algorithm and the Baum-Welch Algorithm on Regular Lattice
A profile hidden Markov model (PHMM) is widely used in assigning protein sequences to protein families. In this model, the hidden states only depend on the previous hidden state and observations are independent given hidden states. In other words, in the PHMM, only the information of the left side of a hidden state is considered. However, it makes sense that considering the information of the b...
Generalized Baum-Welch and Viterbi Algorithms Based on the Direct Dependency among Observations
The parameters of a Hidden Markov Model (HMM) are transition and emission probabilities. Both can be estimated using the Baum-Welch algorithm. The process of discovering the sequence of hidden states, given the sequence of observations, is performed by the Viterbi algorithm. In both the Baum-Welch and Viterbi algorithms, it is assumed that...
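The Viterbi decoding described here is a dynamic program over the same transition and emission parameters. A minimal sketch, reusing the illustrative two-state parameters from the earlier toy example (again, not taken from any of the papers listed here):

```python
import numpy as np

# Illustrative toy HMM parameters (not from the articles above).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])   # A[i, j] = p(q_t = j | q_{t-1} = i)
B = np.array([[0.9, 0.1], [0.2, 0.8]])   # B[i, k] = p(o_t = k | q_t = i)

def viterbi(obs):
    """Most likely hidden-state sequence for obs, in log space for stability."""
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))              # delta[t, j]: best log-prob ending in j at t
    psi = np.zeros((T, N), dtype=int)     # psi[t, j]: argmax predecessor state
    delta[0] = np.log(pi) + np.log(B[:, obs[0]])
    for t in range(1, T):
        scores = delta[t - 1][:, None] + np.log(A)   # scores[i, j]: come from i, go to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + np.log(B[:, obs[t]])
    # Backtrack from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

print(viterbi([0, 1, 1]))
```

Baum-Welch, by contrast, soft-weights all state sequences via forward-backward rather than committing to the single best path as Viterbi does.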
4.1 Overview
In this lecture, we will address problems 3 and 4. First, continuing from the previous lecture, we will view Baum-Welch re-estimation as an instance of the Expectation-Maximization (EM) algorithm and prove why the EM algorithm maximizes data likelihood. Then, we will proceed to discuss discriminative training under the maximum mutual information estimation (MMIE) framework. Specifically, we will...
متن کاملDiscriminative speaker adaptation with conditional maximum likelihood linear regression
We present a simplified derivation of the extended Baum-Welch procedure, which shows that it can be used for Maximum Mutual Information (MMI) estimation of a large class of continuous-emission-density hidden Markov models (HMMs). We use the extended Baum-Welch procedure for discriminative estimation of MLLR-type speaker adaptation transformations. The resulting adaptation procedure, termed Conditional Max...
Model-Theoretic Reformulation of the Baum-Connes and Farrell-Jones Conjectures
The Isomorphism Conjectures are translated into the language of homotopical algebra, where they resemble Thomason’s descent theorems.
Journal:
- CoRR
Volume: abs/1406.7002
Pages: -
Publication date: 2014